"UKON-Fischer-MC3"

VAST 2013 Challenge
Mini-Challenge 3: Visual Analytics for Network Situation Awareness

 

 

Team Members:

Fabian Fischer, University of Konstanz, Fabian.Fischer@uni-konstanz.de (PRIMARY)
Daniel A. Keim, University of Konstanz, Daniel.Keim@uni-konstanz.de

 

Student Team:  No

 

Analytic Tools Used:

Elasticsearch - Open Source Distributed Real Time Search & Analytics (data storage)

VACS - Visual Analytics Suite for Cyber Security (self-written and adapted for challenge)

ClockMap - Enhanced Circular Treemaps with Temporal Glyphs for Time-Series Data (self-written and adapted for clustering)

Programming - HTML5/JavaScript/Java

 

May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2013 is complete? Yes

 

 

 

Video:

ukon-fischer-mc3-video.wmv

 

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Questions

 

MC3.1 – Provide a timeline (i.e., events organized in chronological order) of the notable events that occur in Big Marketing’s computer networks for the two weeks of supplied data. Use all data at your disposal to identify up to twelve events and describe them to the extent possible.  Your answer should be no more than 1000 words long and may contain up to twelve images.

 

1)      Major Events:

 

There are 5 major peaks in the network traffic, each with more than 50,000 flows per minute, and all of them should definitely be reported. Further analysis with our tool VACS reveals more information about these points in time. We loaded several time series to identify the origin of the flow records. The red line represents flows coming from the Internet into our network, while the blue line represents flows originating from our local network. The barely visible yellow line represents the total flows. This breakdown helps us to identify interesting patterns that may be of great interest to the system administrators.
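The per-minute breakdown described above can be sketched in a few lines. This is only an illustration and not the VACS implementation: the record layout `(timestamp, src_ip)`, the function names, and the `is_internal` predicate are assumptions made for this example.

```python
from collections import Counter
from datetime import datetime


def flows_per_minute(flows, is_internal):
    """Count flows per minute, split by where the flow originates.

    flows: iterable of (timestamp, src_ip) tuples (assumed layout).
    is_internal: caller-supplied predicate deciding whether an IP is local.
    """
    counts = {"internal": Counter(), "external": Counter(), "total": Counter()}
    for ts, src_ip in flows:
        # Truncate the timestamp to the start of its minute.
        minute = ts.replace(second=0, microsecond=0)
        origin = "internal" if is_internal(src_ip) else "external"
        counts[origin][minute] += 1
        counts["total"][minute] += 1
    return counts


def peak_minutes(series, threshold):
    """Return the minutes whose flow count exceeds the threshold, in order."""
    return sorted(minute for minute, n in series.items() if n > threshold)
```

Calling `peak_minutes(counts["total"], 50000)` would list the minutes above the peak level mentioned above. Keeping the three series separate mirrors the red, blue, and yellow lines of the chart.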

 

 

2)      First Peak: 2013-04-02 05:10 -> 07:10

 

The following treemap for the selected time span at the first peak reveals that most flows come from the Internet and target destination port 80. This could be a massive DDoS attack or a major marketing campaign that is generating heavy traffic to our servers. Either way, the administrators should consider using external caching services to mitigate such access patterns.

 

Correlating this with all of our web servers reveals that the event involves only one of them, the server with the address 172.30.0.4. The other web servers are not affected by this possible attack. The treemap representation underneath also shows the originating source IPs, which gives the analyst a hint as to whether this is a normal campaign response or a DDoS attack.
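The aggregation behind such a treemap can be approximated as follows. This is a minimal sketch under assumptions: the record layout `(timestamp, src_ip, dst_ip, dst_port)` and the function name `summarize_targets` are hypothetical and do not come from VACS.

```python
from collections import defaultdict
from datetime import datetime


def summarize_targets(flows, start, end):
    """Summarize flows per (dst_ip, dst_port) inside the window [start, end).

    flows: iterable of (timestamp, src_ip, dst_ip, dst_port) tuples (assumed layout).
    Returns {(dst_ip, dst_port): (flow_count, distinct_source_count)}.
    """
    flow_counts = defaultdict(int)
    sources = defaultdict(set)
    for ts, src_ip, dst_ip, dst_port in flows:
        if start <= ts < end:
            key = (dst_ip, dst_port)
            flow_counts[key] += 1
            sources[key].add(src_ip)
    return {key: (flow_counts[key], len(sources[key])) for key in flow_counts}
```

The flow count corresponds to the size of a treemap cell. The number of distinct sources is one possible hint for telling a campaign response from an attack, although it is not conclusive on its own.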

 

 

3)      Second Peak: 2013-04-03 08:00 -> 12:00, with the network still struggling from 12:00 -> 2013-04-04 08:00

 

4)      Third Peak: 2013-04-06 11:00 -> 12:00 (just the highest peak)

 

As a result of this attack, the administrators took the whole network offline, and for good reason: this DDoS attack is quite different from the earlier ones. The earlier peaks primarily target port 80 and might be either DDoS attacks or ongoing marketing campaigns. This time the pattern is different, because the main target is not port 80, as seen in the chart that excludes port 80 traffic:

 

 

5)      Fourth Peak: 2013-04-11 11:30 -> 13:00

 

6)      Fifth Peak: 2013-04-14 14:00 -> 15:15

 

Here we additionally see massive attacks directed at port 3389, which is used by the Remote Desktop Protocol (RDP) on Windows computers.

 

 

7)      Connections to suspicious ports were detected several times:

 

There were several connections to port 6667 (IRC), port 21 (FTP), and port 8080 in the network. These are suspicious and should be investigated further. Also notable are the flows to port 22 (SSH) in the second week, which did not occur before and might be critical as well.
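Ports that appear only in the second week, such as port 22 here, could be found with a simple set difference. The sketch below assumes flow records of the form `(timestamp, dst_port)`; the function name and the split point are illustrative choices, not part of our tool.

```python
from datetime import datetime


def newly_seen_ports(flows, split):
    """Return destination ports seen at or after `split` but never before it.

    flows: iterable of (timestamp, dst_port) tuples (assumed layout).
    split: timestamp separating the baseline period from the later period.
    """
    before, after = set(), set()
    for ts, dst_port in flows:
        if ts < split:
            before.add(dst_port)
        else:
            after.add(dst_port)
    return sorted(after - before)
```

Passing the start of the second week as `split` treats the first week as the baseline.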

 

 

8)      SSH Attack: As an example of the analysis of the suspicious connections to port 22, consider the following picture:

 

The pink connections originate from internal computers in our network, which use many source ports (blue nodes) to reach a single destination port (22/SSH) on a single external computer. This looks very suspicious and is a typical pattern when machines have been compromised and misused for brute-force attacks against external SSH servers. The analyst should look into this, because the intrusion prevention system does not seem to block such outgoing attacks.
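The pattern in the picture, many source ports from one internal host to port 22 on one external host, can also be checked directly on the flow records. The following is a rough sketch: the record layout `(src_ip, src_port, dst_ip, dst_port)`, the `is_internal` predicate, and the threshold of 50 source ports are assumptions chosen for illustration.

```python
from collections import defaultdict


def outgoing_ssh_fanout(flows, is_internal, min_src_ports=50, port=22):
    """Find internal hosts using many source ports toward one external SSH server.

    flows: iterable of (src_ip, src_port, dst_ip, dst_port) tuples (assumed layout).
    is_internal: caller-supplied predicate deciding whether an IP is local.
    Returns {(src_ip, dst_ip): distinct_source_port_count} for suspicious pairs.
    """
    src_ports = defaultdict(set)
    for src_ip, src_port, dst_ip, dst_port in flows:
        # Only outgoing flows from the local network to an external SSH port.
        if dst_port == port and is_internal(src_ip) and not is_internal(dst_ip):
            src_ports[(src_ip, dst_ip)].add(src_port)
    return {
        pair: len(ports)
        for pair, ports in src_ports.items()
        if len(ports) >= min_src_ports
    }
```

A high count alone does not prove a brute-force attack, but it narrows down which host pairs the analyst should inspect first.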

 

 

9)      There are many more interesting events that we could not investigate before the deadline; we would like to continue this as future work.

 

MC3.2 – Speculate on one or more narratives that describe the events on the network. Provide a list of analytic hypotheses and/or unanswered questions about the notable events. In other words, if you were to hand off your timeline to an analyst who will conduct further investigation, what confirmations and/or answers would you like to see in their report back to you? Your answer should be no more than 300 words long and may contain up to three additional images.

 

The analysis of the Big Marketing network is quite challenging because we do not know when legitimate marketing campaigns were running. Some of the web server traffic is probably caused by such campaigns, which can look quite similar to a DDoS attack on the network. The results might actually be the same: when too many people try to connect to our customers’ websites, our servers might not be able to handle the burst, which may eventually lead to a network outage. From a security point of view, these points in time may also hide successful attacks in the sheer amount of flow data. There are several hints that some machines might have been illegally accessed via Remote Desktop. Further investigation would require syslog data from those machines to identify successful logins by unknown individuals using stolen credentials.

 

MC3.3 – Describe the role that your visual analytics played in enabling discovery of the notable events in MC3.1. Describe whether your visual analytics play a role in formulating the questions in MC3.2. Your answer should be no more than 300 words long and may contain up to three additional images.

 

The visual analytics tool was key to identifying all notable events. Although not explicitly mentioned above, it also helped to identify groups of hosts that behave in a similar way, based on the similarity of their underlying time-series data. This information was used to represent clusters in a circular layout using an adapted ClockMap visualization. The thumbnail representation on the left was helpful for spotting similar hosts as well. Sadly, we ran out of time because the tool was not ready until the last day before the challenge deadline, so time was the limiting factor in analyzing the whole dataset. However, even with just a few minutes of using the tool, many interesting events could be found. The interactive line chart is also capable of representing different time series from the related datasets (health and IDS data).
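Grouping hosts by similar time series can be illustrated with a simple correlation-based scheme. This is not the clustering used for the ClockMap view, which is not described here; it is a swapped-in greedy grouping by Pearson correlation, and the function names and the 0.9 threshold are assumptions.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant series has no defined correlation
    return cov / sqrt(var_a * var_b)


def group_similar_hosts(series_by_host, min_corr=0.9):
    """Greedily group hosts whose series correlate with a group's first member.

    series_by_host: {host: [value per time bin]} (assumed layout).
    Returns a list of groups, each a list of host names.
    """
    groups = []
    for host in sorted(series_by_host):
        for group in groups:
            representative = series_by_host[group[0]]
            if pearson(series_by_host[host], representative) >= min_corr:
                group.append(host)
                break
        else:
            # No existing group is similar enough, so start a new one.
            groups.append([host])
    return groups
```

Correlation compares the shape of the series rather than their absolute volume, so a busy host and a quiet host with the same temporal pattern end up in the same group.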